Knowledge Infused Learning (K-IL): Towards Deep Incorporation of Knowledge in Deep Learning
Learning the underlying patterns in data goes beyond instance-based
generalization to external knowledge represented in structured graphs or
networks. Deep learning, which primarily constitutes the neural computing
stream in AI, has shown significant advances in probabilistically learning
latent patterns using multi-layered networks of computational nodes (i.e.,
neurons/hidden units). Structured knowledge, which underlies symbolic
computing approaches and often supports reasoning, has also seen significant
growth in recent years, in the form of broad-based (e.g., DBPedia, Yago) and
domain-, industry-, or application-specific knowledge graphs. A common
substrate with careful integration of the two will raise opportunities to
develop neuro-symbolic learning approaches for AI, where conceptual and
probabilistic representations are combined. As the incorporation of external
knowledge will aid in supervising the learning of features for the model,
deep infusion of representational knowledge from knowledge graphs within
hidden layers will further enhance the learning process. Although much work
remains, we believe that knowledge graphs will play an increasing role in
developing hybrid neuro-symbolic intelligent systems (bottom-up deep learning
with top-down symbolic computing) as well as in building explainable AI
systems, for which knowledge graphs will provide the scaffolding for
punctuating neural computing. In this position paper, we describe our
motivation for such a neuro-symbolic approach and a framework that combines
knowledge graphs and neural networks.
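One way to picture "deep infusion" of knowledge within hidden layers is a learned gate that mixes a knowledge-graph concept embedding into a hidden activation. The sketch below is a hypothetical illustration, not the paper's architecture: the function name `infused_hidden_layer`, the gating formulation, and all dimensions are assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(0.0, x)

def infused_hidden_layer(h, k, W_h, W_k):
    """Merge hidden activation h with KG concept embedding k via a gate.

    The sigmoid gate g decides, per dimension, how much of the original
    activation to keep versus how much projected knowledge to let in.
    (Illustrative formulation only.)
    """
    g = 1.0 / (1.0 + np.exp(-(W_h @ h + W_k @ k)))  # sigmoid gate
    return g * h + (1.0 - g) * (W_k @ k)

d = 8
h = rng.standard_normal(d)               # hidden-layer activation
k = rng.standard_normal(d)               # knowledge-graph concept embedding
W_h = rng.standard_normal((d, d)) * 0.1  # gate weights for the activation
W_k = rng.standard_normal((d, d)) * 0.1  # projection of the KG embedding

out = infused_hidden_layer(relu(h), k, W_h, W_k)
print(out.shape)  # (8,)
```

In a trained network, `W_h` and `W_k` would be learned jointly with the rest of the model, so the supervision signal shapes how much external knowledge each dimension absorbs.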
Defining and Detecting Toxicity on Social Media: Context and Knowledge are Key
As the role of online platforms has become increasingly prominent for communication, toxic behaviors, such as cyberbullying and harassment, have been rampant in the last decade. At the same time, online toxicity is multi-dimensional and sensitive in nature, which makes its detection challenging. As exposure to online toxicity can have serious implications for individuals and communities, reliable models and algorithms are required for detecting and understanding such communications. In this paper, we define toxicity, drawing on social theories, to provide a foundation. Then, we present an approach that identifies multiple dimensions of toxicity and incorporates explicit knowledge in a statistical learning algorithm to resolve ambiguity across such dimensions.
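To make the role of explicit knowledge concrete, here is a minimal, hypothetical sketch of disambiguation before classification: a single lexicon entry (`benign_contexts`, invented for this example) tells the system that a surface-toxic word can be benign in certain contexts. The real approach uses richer knowledge inside a statistical learner; this only illustrates the ambiguity-resolution idea.

```python
# Hypothetical context knowledge: tokens near "kill" that signal a
# benign (gaming) sense rather than a threat. Invented for illustration.
benign_contexts = {"kill": {"game", "level", "boss"}}

def is_toxic(tokens):
    """Flag 'kill' as toxic unless nearby context matches a benign sense."""
    toks = [t.lower() for t in tokens]
    for i, t in enumerate(toks):
        if t == "kill":
            window = set(toks[max(0, i - 3): i + 4])  # +/- 3-token window
            if window & benign_contexts["kill"]:
                return False  # gaming sense, not a threat
            return True
    return False

print(is_toxic("I will kill you".split()))             # True: no benign cue
print(is_toxic("kill the boss in this game".split()))  # False: gaming context
```

A statistical model would consume such knowledge-derived signals as features rather than hard rules, letting the learner weigh context against other evidence.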
Characterization of Time-variant and Time-invariant Assessment of Suicidality on Reddit using C-SSRS
Suicide is the 10th leading cause of death in the U.S. (1999-2019). However,
predicting when someone will attempt suicide has been nearly impossible. In the
modern world, many individuals suffering from mental illness seek emotional
support and advice on well-known and easily accessible social media platforms
such as Reddit. While prior artificial intelligence research has demonstrated
the ability to extract valuable information from social media on suicidal
thoughts and behaviors, these efforts have not considered both the severity and
the temporality of risk. The insights made possible by access to such data have
enormous clinical potential, most dramatically envisioned as a trigger to
employ timely and targeted interventions (e.g., voluntary or involuntary
psychiatric hospitalization) to save lives. In this work, we address this
knowledge gap by developing deep learning algorithms to assess suicide risk in
terms of severity and temporality from Reddit data based on the Columbia
Suicide Severity Rating Scale (C-SSRS). In particular, we employ two deep
learning approaches, time-variant and time-invariant modeling, for user-level
suicide risk assessment, and evaluate their performance against a
clinician-adjudicated gold-standard Reddit corpus annotated based on the
C-SSRS. Our results suggest that the time-variant approach outperforms the
time-invariant method in the assessment of suicide-related ideations and
supportive behaviors (AUC: 0.78), while the time-invariant model performed
better in predicting suicide-related behaviors and suicide attempts
(AUC: 0.64). The proposed approach can be integrated with clinical diagnostic
interviews for improving suicide risk assessments. Comment: 24 pages,
8 tables, 6 figures; accepted by PLoS One; one of the two datasets mentioned
in the manuscript has closed access. We will make it public after PLoS One
produces the manuscript.
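The contrast between the two modeling styles can be sketched in a few lines. This is a hedged illustration, not the paper's networks: both functions, the embedding dimensions, and the `decay` parameter are assumptions chosen only to show how one representation ignores post ordering while the other is sensitive to it.

```python
import numpy as np

rng = np.random.default_rng(1)
posts = rng.standard_normal((5, 4))  # 5 post embeddings (4-dim), oldest first

def time_invariant(posts):
    """Ignore ordering: pool all of a user's posts into one vector."""
    return posts.mean(axis=0)

def time_variant(posts, decay=0.7):
    """Respect ordering: a recurrence that weighs recent posts more."""
    state = np.zeros(posts.shape[1])
    for p in posts:  # iterate oldest to newest
        state = decay * state + (1.0 - decay) * p
    return state

u_inv = time_invariant(posts)  # same result under any post permutation
u_var = time_variant(posts)    # changes if the post order changes
print(u_inv.shape, u_var.shape)
```

A downstream classifier would map either user vector to a C-SSRS severity level; the paper's finding is that which representation works better depends on the risk category being predicted.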
Modeling Islamist Extremist Communications on Social Media using Contextual Dimensions: Religion, Ideology, and Hate
Terror attacks have been linked in part to online extremist content. Although
tens of thousands of Islamist extremism supporters consume such content, they
are a small fraction relative to peaceful Muslims. The efforts to contain the
ever-evolving extremism on social media platforms have remained inadequate and
mostly ineffective. Divergent extremist and mainstream contexts challenge
machine interpretation, with a particular threat to the precision of
classification algorithms. Our context-aware computational approach to the
analysis of extremist content on Twitter breaks down this persuasion process
into building blocks that acknowledge the inherent ambiguity and sparsity that
likely challenge both manual and automated classification. We model this
process using a combination of three contextual dimensions -- religion,
ideology, and hate -- each elucidating a degree of radicalization and
highlighting independent features to render them computationally accessible.
We utilize domain-specific knowledge resources for each of these contextual
dimensions: the Qur'an for religion, the books of extremist ideologues and
preachers for political ideology, and a social media hate speech corpus for
hate. Our study makes three contributions to reliable analysis: (i)
development of a computational approach rooted in the contextual dimensions of
religion, ideology, and hate that reflects strategies employed by online
Islamist extremist groups; (ii) an in-depth analysis of relevant tweet
datasets with respect to these dimensions to exclude likely mislabeled users;
and (iii) a framework for understanding online radicalization as a process to
assist counter-programming. Given the potentially significant social impact,
we evaluate the performance of our algorithms to minimize mislabeling; our
approach outperforms a competitive baseline by 10.2% in precision.
Comment: 22 pages.
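The three-dimension scheme can be illustrated as scoring a post against one knowledge resource per dimension. The toy lexicons and the overlap scoring below are invented for the example; the actual approach draws on the Qur'an, ideologue texts, and a hate speech corpus rather than word lists.

```python
# Hypothetical mini-lexicons standing in for the three knowledge resources.
religion_lex = {"scripture", "prayer", "faith"}
ideology_lex = {"caliphate", "regime", "doctrine"}
hate_lex = {"destroy", "enemy", "traitor"}

def dimension_scores(text):
    """Score a post along religion, ideology, and hate by lexicon overlap.

    Returns the fraction of distinct tokens matching each lexicon; a
    classifier could consume this 3-dim vector as contextual features.
    """
    tokens = set(text.lower().split())
    def score(lex):
        return len(tokens & lex) / max(len(tokens), 1)
    return [score(religion_lex), score(ideology_lex), score(hate_lex)]

print(dimension_scores("the doctrine of the regime must destroy the enemy"))
```

Keeping the dimensions as separate features, rather than a single extremism score, is what lets the analysis distinguish, say, religious language without ideological or hateful context from genuinely radicalized content.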